Abstract. We present a new, self-contained proof of decidability of the
limitedness problem. The key novelty is a description using profinite words, which
unifies and simplifies the previous approaches, and seamlessly extends the
theory of regular languages. We also define a logic over profinite words,
called MSO+inf, and show that the satisfiability problem of MSO+B
reduces to the satisfiability problem of our logic.


1 Introduction 


This paper is an attempt to establish a natural framework for problems related to 
the limitedness problem. A notable example of such a problem is the decidability 
of the logic MSO+B.


[Figure 1: an automaton with transitions labeled a:1 and b:0.]
Fig. 1. A distance automaton over the input alphabet {a,b}.


The limitedness problem was introduced by Hashiguchi [8] on his way to 
solving the famous star height problem. In its basic form, it concerns distance 
automata, i.e. nondeterministic automata, whose transitions are additionally la- 
beled by nonnegative, integer weights, such as the one depicted in Figure 1. A 
distance automaton is limited if there exists a bound n such that every accepted 
word has some accepting run whose sum of weights is bounded by n. Thus the 
limitedness problem is a decision problem which asks whether a given automaton 
is limited. The automaton in the example is not limited: the words $a, a^2, a^3, \ldots$
require accepting runs of ever larger weights.
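The definition above can be tried out on small inputs. The following is a minimal sketch, not part of the original development: it computes the least weight of an accepting run by dynamic programming over prefixes. The encoding of transitions as tuples and the one-state shape assumed for the automaton of Figure 1 (loops a:1 and b:0) are our assumptions for illustration.

```python
from math import inf

def min_run_weight(word, transitions, initial, accepting):
    """Least total weight of an accepting run over `word`, or inf if there is none."""
    # best[q] = least weight of a run over the prefix read so far that ends in q
    best = {q: 0 for q in initial}
    for letter in word:
        nxt = {}
        for (src, sym, weight, dst) in transitions:
            if sym == letter and src in best:
                cand = best[src] + weight
                if cand < nxt.get(dst, inf):
                    nxt[dst] = cand
        best = nxt
    return min((w for q, w in best.items() if q in accepting), default=inf)

# Assumed shape of the automaton of Figure 1: a single state with loops a:1 and b:0.
FIG1 = [("q", "a", 1, "q"), ("q", "b", 0, "q")]

weights = [min_run_weight("a" * n, FIG1, {"q"}, {"q"}) for n in range(1, 6)]
print(weights)  # grows with n, so no single bound works for all accepted words
```

Under these assumptions the weights over a, aa, aaa, ... grow without bound, which is the sense in which the automaton is not limited.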

The logic MSO+B was introduced by Bojańczyk in his dissertation (see
also [2]) in relation with a problem concerning the modal μ-calculus. It is an
extension of the usual MSO logic, over infinite trees or words, by the quantifier B,
defined so that the formula $BX.\varphi(X)$ holds if and only if all the sets of positions
X satisfying the formula $\varphi$ in the given model have a commonly bounded size.
A typical language of infinite words defined in this logic is:


$$L_B = \{a^{n_1} b a^{n_2} b \cdots : \text{the sequence } n_1, n_2, \ldots \text{ is bounded}\}.$$


Note that this language is not ω-regular, as its complement does not contain
any ultimately periodic word. As a far-reaching project (see [3] for a survey),
Bojańczyk posed the question of decidability of satisfiability of the logic MSO+B
over infinite trees. Still, it is not even known to be decidable over infinite words.

* Author supported by ERC Starting Grant “Sosna”.

A syntactic fragment of the logic MSO+B has been shown decidable in [4].
The key tool used in that paper is a model of automata called ωB-automata.
Later, the authors discovered that limitedness of distance automata can be
easily decided using their results concerning ωB-automata. The link with the
limitedness problem has been exploited in [6], where Colcombet defined B-automata
and developed his theory of regular cost functions and stabilization semigroups.
B-automata directly generalize distance automata, by allowing more than one
counter which, moreover, can be reset.


Our contribution is a theory which we believe to be the appropriate setting 
for considering limitedness of B-automata, and related problems. As a starting 
point, we see that B-automata naturally define languages of profinite words. The 
set of profinite words has a rich algebraic and topological structure, which we 
find very useful in the context of limitedness. 

For instance, consider the distance automaton from Figure 1. There is a
profinite word, denoted $a^\omega$ (not to be confused with the infinite word), which witnesses
the fact that the automaton is not limited: this word can be defined as the limit
of the sequence of finite words $(a^{n!})_{n=1}^{\infty}$. We say that this profinite word does
not belong to the language of this automaton; the language of this automaton
consists of profinite words which only have finitely many a’s, such as $b$ or $b^\omega a$.

We call the class of languages of profinite words defined by B-automata B- 
regular languages. Our main result states that this class can be characterized 
in terms of logic, regular expressions and semigroups. The result generalizes the 
main results of the papers [11, 13, 9, 1, 4], and implies the main result of [6, 7]. The
description in terms of semigroups immediately implies decidability of the limit- 
edness problem for B-automata, which, in our framework is simply the question 
of language universality. In particular, together with Kirsten’s elegant reduction 
of the star height problem to the limitedness problem, our result gives yet an- 
other proof of decidability of the star height problem. The result also implies 
decidability of a more general problem — limitedness of Boolean combinations of 
B-automata. The remaining characterizations are primarily of conceptual value, 
as they manifest both that our framework is appropriate, and that the class 
of B-regular languages is robust. Note that most of these characterizations are 
also available in the framework of Colcombet. One exception is a new,
finite-index characterization of B-regular languages, à la the Myhill-Nerode theorem;
it seems that this result cannot even be phrased in the other frameworks.

Lastly, we show that our framework is suited for dealing with the satisfiability
problem for MSO+B over infinite words: we prove that this problem can be
reduced to the satisfiability problem of a new logic MSO+inf over profinite
words, which we introduce here. This seems impossible in the other frameworks.
In fact, our reduction is very general, and works for many logics. The proof
extends Büchi’s ideas, and consists of two key ingredients: convergent Ramsey
factorizations of infinite words, and a model of deterministic automata over
infinite words with a profinite acceptance condition.


Related work. Several proofs of decidability of the limitedness problem exist [8,
11, 13, 9, 1, 6]. Our proof builds on ideas from all of these papers, and simplifies
them greatly. Hashiguchi’s #-expressions acquire a new, concrete meaning in
our framework, as simply defining profinite words. We extend Leung’s insight of
considering the compact topological semigroup of all matrices over the tropical 
semiring, to considering the profinite semigroup. Also, Leung introduced finite 
versions of his topological semigroups, which are predecessors of stabilization 
semigroups of Colcombet. The factorization forests of Simon play a key role in 
the main technical part of our proof. The proof of Kirsten applies to a model 
very similar to B-automata, but with a hierarchical constraint on the counter 
operations. Kirsten generalized Leung’s proof, providing further instances of sta- 
bilization semigroups; however, the topological insights of Leung disappeared, as 
he no longer considered compact topological semigroups. 

Colcombet used ideas from [4] and of Kirsten in [7], where he developed his 
theory of regular cost functions. In his theory, a B-automaton defines a B-regular 
cost function — an equivalence class of number-valued functions. These cost func- 
tions also have equivalent descriptions in terms of regular expressions, logic and 
semigroups. The crucial discovery of that paper is the tight two-way correspon- 
dence between stabilization semigroups (defined there) and B-automata. Still, 
the topological insights of Leung remained missing. 

On a general level, and also on the level of proof structure, our approach 
resembles the approach of Colcombet. We outline the key differences. As we 
deal with languages which are subsets of a topological semigroup, many classical 
notions naturally lift to our setting — such as recognizable subsets, Myhill-Nerode 
equivalence, homomorphisms. In Colcombet’s framework, cost functions are not 
sets, and have no apparent algebraic nor topological structure (they only have 
a lattice structure, corresponding to the lattice ordering of languages). Because 
of this, the natural notions mentioned above do not exist, or have non-obvious 
definitions. An example is the complex notion of compatible mapping [6], which
corresponds to our ∞-homomorphism. Even the notion of a Boolean combination
of cost functions is meaningless. As a result, cost functions are not well-suited
for the study of the full logic MSO+B. On a technical level, the proofs in [6, 7]
deal with the relative notions of “big” vs. “small” values, and this relativity needs 
to be carefully controlled in the calculations and proofs. In our more abstract 
setting, we deal with the absolute notions of infinite vs. finite, and computations 
involve usual set-theoretic equalities. 


Outline of the paper. First, we recall the definitions of B- and S-automata, and of 
profinite words. Next, we show how languages of profinite words can be defined 
using automata, regular expressions and logic. Then we present our main tech- 
nical tool — recognition by homomorphisms. In Section 5, we state the central 
result. Finally, we show a link between languages of infinite words and of profinite 
words. Due to space limitations, many details are deferred to the appendix. 
2 Preliminaries


Let us fix a finite alphabet A; finite words are assumed to be elements of $A^+$. In
the examples, we will more concretely assume the alphabet A = {a,b}. By $\mathbb{N}$
we denote {0,1,2,...}, and by $\overline{\mathbb{N}}$ we denote $\mathbb{N} \cup \{\omega\}$. We treat $\overline{\mathbb{N}}$ as a compact
metric space, in which $d(m,n) = |2^{-m} - 2^{-n}|$ (where $2^{-\omega} = 0$).
B-automata and S-automata (implicit in [4], defined in [6]) are nondeter- 
ministic automata over finite words, equipped with a finite number of counters. 
There are two counter operations available for each counter: inc increases the 
current value of the counter by 1 and reset sets the value to 0. A transition of 
a B- or S-automaton may trigger any sequence of operations on its counters. If 
the operation reset is performed in a run ρ on a counter which currently stores
a value n, then we say that n is a reset value in the considered run ρ. The two
models, B- and S-automata, differ in the semantics of the functions they define.
First, consider a B-automaton $\mathcal{A}$. Since $\mathcal{A}$ is nondeterministic, there might
be many runs over a single word. For a particular run ρ, we define the value of ρ
as its maximal reset value. Next, the valuation $f_{\mathcal{A}}(w)$ of an input word w under
the automaton $\mathcal{A}$ is the minimum of the values of all accepting runs ρ over w:

$$f_{\mathcal{A}}(w) = \min_{\rho} \max\{n : \text{in the run } \rho, \text{ the value } n \text{ is a reset value}\}.$$

Note that min ranges only over the accepting runs ρ of $\mathcal{A}$. We assume $\max(\emptyset) = 0$
and $\min(\emptyset) = \omega$, so if $\mathcal{A}$ has no accepting run over w, then $f_{\mathcal{A}}(w) = \omega$.

If $\mathcal{A}$ is an S-automaton, the definition of a valuation $f_{\mathcal{A}}(w)$ of an input word w
is completely dual: simply swap min with max in the formula above.


Example 1 (The running example). Let A be the B-automaton with one counter 
which is depicted in the left-hand side of the figure below. 


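[Figure: on the left, the B-automaton $\mathcal{A}$, with transition labels "a: inc" and "b: reset";
on the right, the S-automaton $\mathcal{B}$, with transition labels "a,b: ε", "a: inc" and "reset".]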


We declare that the automaton resets its counter after reading the entire word 
— this extra feature can be easily eliminated using nondeterminism. Then, 


$$f_{\mathcal{A}}(w) = \max\{n_1, n_2, \ldots, n_k\} \quad \text{for } w = a^{n_1} b a^{n_2} \cdots b a^{n_k}.$$


Now consider the S-automaton $\mathcal{B}$ depicted in the right-hand side of the figure.
It has one counter, which is also assumed to be reset at the end of the run. The
reader can check that each accepting run of $\mathcal{B}$ over an input word w corresponds
to a block of a’s in w, and that $f_{\mathcal{B}}(w)$ is the length of the largest block of a’s in
w. Therefore, $f_{\mathcal{B}}$ and $f_{\mathcal{A}}$ are precisely the same function from $A^+$ to $\mathbb{N}$.
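The B-semantics (minimum over accepting runs of the maximal reset value) can be sketched by brute force for short words. This is an illustrative sketch only: the tuple encoding of transitions, the function name `b_value`, and the one-state shape assumed for the running B-automaton (a: inc, b: reset) are our assumptions, and `math.inf` stands in for ω.

```python
from math import inf

def b_value(word, transitions, initial, accepting, n_counters=1):
    """Valuation of a B-automaton: min over accepting runs of the max reset value."""
    # A configuration is (state, counter values, largest reset value so far).
    configs = {(q, (0,) * n_counters, 0) for q in initial}
    for letter in word:
        nxt = set()
        for (state, counters, top) in configs:
            for (src, sym, ops, dst) in transitions:
                if src != state or sym != letter:
                    continue
                vals, best = list(counters), top
                for (op, c) in ops:
                    if op == "inc":
                        vals[c] += 1
                    else:  # "reset": the discarded value is a reset value
                        best = max(best, vals[c])
                        vals[c] = 0
                nxt.add((dst, tuple(vals), best))
        configs = nxt
    # Convention of Example 1: every counter is reset once more at the end.
    finals = [max((top,) + counters) for (q, counters, top) in configs if q in accepting]
    return min(finals, default=inf)

# Assumed shape of the running B-automaton: one state, a increments, b resets.
RUNNING = [("q", "a", (("inc", 0),), "q"), ("q", "b", (("reset", 0),), "q")]

print(b_value("aabaaab", RUNNING, {"q"}, {"q"}))
print([b_value("a" * n, RUNNING, {"q"}, {"q"}) for n in range(1, 5)])
```

On this assumed automaton the computed value is the length of the longest block of a's, in line with the formula stated in Example 1.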


Example 2. Let $\mathcal{A}$ be a finite nondeterministic automaton. If we view $\mathcal{A}$ as a
B-automaton with no counters, the induced function assigns 0 to any word accepted
by $\mathcal{A}$ and ω to any rejected word. Dually, if we treat $\mathcal{A}$ as an S-automaton, the
induced function assigns ω to any accepted word, and 0 to any rejected word.


A B- or S-automaton $\mathcal{A}$ is said to be limited if the function $f_{\mathcal{A}}$ has finite range
(the range may nevertheless contain the value ω). The limitedness problem for B- or
S-automata is then to decide whether a given B- or S-automaton is limited. The
automata in Example 1 are not limited, since $f_{\mathcal{A}}(a^n) = n$ for any $n \in \mathbb{N}$.


Profinite words should be thought of as limits of sequences of finite words, with 
respect to all regular languages. A formal definition follows (see e.g. [12] for more 
details). We say that an infinite sequence $w_1, w_2, \ldots \in A^+$ of finite (nonempty)
words ultimately belongs to the regular language $L \subseteq A^+$ if almost all the
words $w_1, w_2, \ldots$ belong to L. We say that a sequence of words is convergent if,
for any regular language L, the sequence ultimately belongs to L or ultimately
belongs to the complement of L. Every constant sequence is convergent. The
sequence $a^{1!}, a^{2!}, a^{3!}, \ldots$ is also convergent, as follows from a pumping argument for
regular languages. However, the sequence $a, a^2, a^3, \ldots$ is not convergent, since the
regular language $(aa)^*$ only contains every other of its elements. Two convergent
sequences are equivalent if they belong ultimately to precisely the same regular
languages. In other words, interleaving one sequence with the other yields a
convergent sequence. An equivalence class of convergent sequences is a profinite
word. A profinite word is uniquely specified by the set of regular languages to
which it ultimately belongs. For example, the equivalence class of the convergent
sequence $a^{1!}, a^{2!}, a^{3!}, \ldots$, which is a profinite word denoted $a^\omega$, ultimately belongs
to the languages $a^+, (aa)^*, (aaa)^*, \ldots$, and does not ultimately belong to the
languages $a^* \cdot b \cdot a^*$ nor $a \cdot (aa)^+$. We denote profinite words by x, y, ..., and
the set of all profinite words by $\widehat{A^+}$. We define $\widehat{A^*} = \widehat{A^+} \cup \{\varepsilon\}$, where ε is the
empty word. Note that the set of finite words $A^+$ naturally embeds into the set
of profinite words $\widehat{A^+}$, via constant convergent sequences. We call subsets of $\widehat{A^+}$
or of $\widehat{A^*}$ languages of profinite words.
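The contrast between the two unary sequences can be observed numerically on the languages of words whose length is a multiple of k. The sketch below is only a finite sanity check on a prefix of each sequence, not a proof of convergence; the helper names are ours.

```python
from math import factorial

def in_power_language(n, k):
    """Is the word of n letters a in the language of k-fold repetitions, i.e. is n a multiple of k?"""
    return n % k == 0

def ultimately_constant(bits):
    """Crude finite check: all entries from the middle of the list onward agree."""
    tail = bits[len(bits) // 2:]
    return all(b == tail[0] for b in tail)

N = 12
for k in (2, 3, 4, 5):
    fact = [in_power_language(factorial(n), k) for n in range(1, N + 1)]
    plain = [in_power_language(n, k) for n in range(1, N + 1)]
    # The factorial sequence settles inside the language; the plain one keeps alternating.
    print(k, ultimately_constant(fact), ultimately_constant(plain))
```

The reason the factorial sequence settles is elementary: n! is a multiple of k as soon as n is at least k.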


The set of profinite words forms a semigroup: if $w_1, w_2, \ldots$ and $v_1, v_2, \ldots$ are
two convergent sequences, then the sequence $w_1 v_1, w_2 v_2, \ldots$ is also convergent.
There is another important operation on profinite words, called the ω-power. The
ω-power of a convergent sequence $w_1, w_2, w_3, \ldots$ is the sequence $w_1^{1!}, w_2^{2!}, w_3^{3!}, \ldots$,
which also turns out to be convergent. This operation induces an operation
$x \mapsto x^\omega$ defined over profinite words.


The set of profinite words carries a compact metric: the distance between
two profinite words x, y is $\frac{1}{n}$, where n is the smallest size (measured as the size of
the minimal automaton) of a regular language L such that x ultimately belongs
to L and y does not. This metric is compatible with the notion of convergence
defined above. In particular, the set $A^+$ of finite words is dense in the set of
profinite words, $\widehat{A^+}$. Multiplication and the ω-power are continuous mappings
over $\widehat{A^+}$. One can prove that $x^\omega = \lim_{n \to \infty} x^{n!}$ for any $x \in \widehat{A^+}$.
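The identity $x^\omega = \lim x^{n!}$ has a well-known finite counterpart: in a finite semigroup, the powers $s^{n!}$ are eventually constant and equal to the unique idempotent power of s. A small sketch, using multiplication modulo 10 as an arbitrary example semigroup of our choosing:

```python
from math import factorial

def power(mul, s, n):
    """s multiplied with itself n times (n >= 1) using the binary operation mul."""
    result = s
    for _ in range(n - 1):
        result = mul(result, s)
    return result

def idempotent_power(mul, s):
    """The unique idempotent among s, s^2, s^3, ... in a finite semigroup."""
    p = s
    while mul(p, p) != p:
        p = mul(p, s)
    return p

def mul10(x, y):
    # Example semigroup chosen for illustration: integers modulo 10 under multiplication.
    return (x * y) % 10

factorial_powers = [power(mul10, 2, factorial(n)) for n in range(1, 7)]
print(factorial_powers)             # eventually constant
print(idempotent_power(mul10, 2))   # the value at which the sequence settles
```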


The closure $\overline{L}$ in $\widehat{A^+}$ of any regular language $L \subseteq A^+$ turns out to be both
closed and open, i.e. clopen in $\widehat{A^+}$. Conversely, any clopen subset of $\widehat{A^+}$ is of
the form $\overline{L}$ for some regular language L, so clopen sets correspond precisely to
regular languages. Any open set in $\widehat{A^+}$ is a (possibly infinite) union of clopen sets.


3 Languages of profinite words 


In this section we discuss several ways of describing languages of profinite words 
— via automata, regular expressions and logic. 

B- and S-regular languages. The essential idea underlying our theory is to 
consider B- and S-automata as processing not only finite words, but also profinite 
words. Let $\mathcal{A}$ be a B- or S-automaton. The following simple observation relies
on the fact that for each $n \in \mathbb{N}$, the language $\{w \in A^+ : f_{\mathcal{A}}(w) < n\}$ is regular.


Fact 1. Let $w_1, w_2, \ldots$ be a convergent sequence of finite words. Then, the
sequence $f_{\mathcal{A}}(w_1), f_{\mathcal{A}}(w_2), \ldots$ is convergent in $\overline{\mathbb{N}} = \mathbb{N} \cup \{\omega\}$.


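Therefore, it makes sense to define, for any $x \in \widehat{A^+}$,

$$\hat{f}_{\mathcal{A}}(x) \stackrel{\text{def}}{=} \lim_{n \to \infty} f_{\mathcal{A}}(w_n),$$

where $w_1, w_2, \ldots$ is any sequence of finite words which converges to x. This
value may happen to be ω. It is straightforward to show that $\hat{f}_{\mathcal{A}}$ is a
well-defined continuous function from $\widehat{A^+}$ to $\overline{\mathbb{N}}$. Moreover, by density of $A^+$ in $\widehat{A^+}$,
the continuous extension of $f_{\mathcal{A}}$ to $\widehat{A^+}$ is unique, so we will further identify $f_{\mathcal{A}}$
with the continuous mapping $\hat{f}_{\mathcal{A}} : \widehat{A^+} \to \overline{\mathbb{N}}$.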
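As a small numeric illustration of this limit, take the function of Example 1 (length of the longest block of a's) along the words obtained by repeating the letter a exactly n! times, and measure the distance to ω in the metric on the extended naturals. The sketch is ours: ω is encoded as `math.inf`, and only a short prefix of the sequence is inspected.

```python
from math import factorial, inf

def longest_a_block(word):
    """Length of the longest block of consecutive a's in a finite word."""
    best = run = 0
    for letter in word:
        run = run + 1 if letter == "a" else 0
        best = max(best, run)
    return best

def d(m, n):
    """The metric |2^-m - 2^-n| on the naturals extended by ω, with ω encoded as math.inf."""
    return abs(2.0 ** -m - 2.0 ** -n)

values = [longest_a_block("a" * factorial(n)) for n in range(1, 7)]
gaps = [d(v, inf) for v in values]
print(values)
print(gaps)   # distances to ω shrink towards 0
```

The shrinking distances are consistent with the extended function taking the value ω on the limit profinite word.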

Similarly to the idea underlying cost functions [6], we do not care about the 
exact values of the function $f_{\mathcal{A}}$ (this would quickly lead to undecidability, as
demonstrated by Krob [10]). What we care about is over which sequences of
words $f_{\mathcal{A}}$ grows indefinitely. By continuity of $f_{\mathcal{A}}$ and compactness of $\widehat{A^+}$, this
is encoded in the set

$$\{x \in \widehat{A^+} : f_{\mathcal{A}}(x) = \omega\}.$$

This is a closed set, as the inverse image of a point under a continuous mapping.

This motivates the following definitions. For an S-automaton $\mathcal{A}$, we define
the set $L(\mathcal{A})$ consisting of all profinite words x such that $f_{\mathcal{A}}(x) = \omega$. For a
B-automaton $\mathcal{A}$, we define $L(\mathcal{A})$ dually, as the language of all profinite words
x such that $f_{\mathcal{A}}(x) < \omega$. In either case, we call $L(\mathcal{A})$ the language recognized
by $\mathcal{A}$. The reason why the definitions differ is that S-automata try to
maximize, while B-automata try to minimize the value of a run. We call a language
$L \subseteq \widehat{A^+}$ B-regular (respectively, S-regular) if it is recognized by a B-automaton
(respectively, S-automaton). Note that S-regular languages are closed, and
B-regular languages are open subsets of $\widehat{A^+}$. In particular, a language is both B-
and S-regular if and only if it is clopen.


Example 3. Let $\mathcal{A}$ be the B-automaton from Example 1, computing the largest
block of a’s. Then $L(\mathcal{A})$ is the language of all profinite words in which the
blocks of a’s have uniformly bounded length:

$$L(\mathcal{A}) = \{x \in \widehat{A^+} : f_{\mathcal{A}}(x) < \omega\} = \bigcup_{n \in \mathbb{N}} \{x \in \widehat{A^+} : x \text{ has no infix } a^n\}.$$


It is not difficult to show (using compactness and continuity of multiplication)
that a profinite word has arbitrarily long blocks of a’s if and only if it contains
$a^\omega$ as an infix. (We say that u is an infix of v if $v = v_1 \cdot u \cdot v_2$ for some, potentially
empty, profinite words $v_1, v_2$.) Therefore, if $\mathcal{B}$ is the S-automaton from Example 1
(recall that $f_{\mathcal{A}} = f_{\mathcal{B}}$), we deduce that

$$L(\mathcal{B}) = \{x \in \widehat{A^+} : f_{\mathcal{B}}(x) = \omega\} = \widehat{A^+} - L(\mathcal{A}) = \{x_1 \cdot a^\omega \cdot x_2 : x_1, x_2 \in \widehat{A^*}\}.$$


Limitedness. Assume that we want to test for limitedness of a B-automaton A. 
It is easy to reduce the general case to the case when the underlying finite 
automaton accepts all finite words (to do this, it suffices to consider the disjoint 
union of $\mathcal{A}$ and $\mathcal{A}'$, where $\mathcal{A}'$ is a B-automaton which maps all words accepted
by $\mathcal{A}$ to ω, and the rest to 0). Then, an immediate compactness argument shows:


Fact 2. A B-automaton $\mathcal{A}$ which accepts all finite words is limited iff $L(\mathcal{A}) = \widehat{A^+}$.


Closure properties. As usual with nondeterministic automata, both classes — of 
B- and S-regular languages — are closed under language projection, and also 
under union and intersection. They are not, however, closed under complements: 
the complement of the B-regular language L (A) from the previous example is not 
B-regular, since it is not an open set. However, this complement is an S-regular 
language, as it is equal to L (B). More generally, we will prove the difficult result 
that complements of B-regular languages are S-regular, and vice versa. 


The logic MSO+inf. We introduce the logic MSO+inf over profinite words.
First, we define its base fragment, the logic MSO. A formula of this logic describes
a set of profinite words. Usually, in the case of finite or infinite words, one sees
such a word as a model whose elements are positions of the word, and so a formula
of MSO speaks about sets of positions of the word. However, in profinite words,
“positions” are not well-defined. To define the logic MSO over profinite words,
we view the constructs of MSO as operations on languages of profinite words.
We describe how to interpret the second-order existential quantifier ∃; for the
other constructs, the idea is even simpler. We view the quantifier ∃ as language
projection. What language do we project? A formula $\varphi(X)$ beneath a quantifier
∃ defines a language $L_\varphi$ over the extended alphabet A × {0,1}. For example,
$\varphi(X) = a(X) \land \mathrm{singleton}(X)$ defines the language $L_\varphi$ of those profinite words over
A × {0,1} which contain precisely one symbol (a, 1) and no other symbols with
a 1 on the second coordinate. We define the language of the formula $\exists X.\varphi(X)$
as the projection of the language $L_\varphi$, forgetting about the second coordinate.
Therefore, $\exists X.\, a(X) \land \mathrm{singleton}(X)$ describes the set of profinite words which
have at least one letter a (the remaining letters a are paired with 0).

With similar ideas, it is easy to interpret all the usual constructs of MSO as
language operations: the Boolean connectives ∧, ∨, ¬, the binary predicates <, ∈
and the unary predicates a(X), for each letter a ∈ A. This way, we define the
semantics of the MSO logic over profinite words. This logic describes precisely
the class of clopen sets. To go beyond that, we add a predicate inf(X) which
holds in a profinite word over A × {0,1} if it has infinitely many 1’s on the
second coordinate. This is a closed, but not open property of profinite words
over the alphabet A × {0,1}, so it is not definable in MSO. We denote the logic
MSO extended by the predicate inf by MSO+inf, and distinguish the syntactic
fragment MSO+inf$^+$ (resp. MSO+inf$^-$) where the predicate inf appears only
under an even (resp. odd) number of negations.


Example 4. Consider the S-regular language $L(\mathcal{B})$ from Example 3: “there is an
infinite block of a’s”. It can be described by the following formula of MSO+inf$^+$:

$$\exists X.\ \mathrm{inf}(X) \land \forall x,y,z.\ \big(x \in X \land z \in X \land (x < y < z) \Rightarrow (y \in X \land a(y))\big).$$


This example can be easily extended, yielding the following. 


Proposition 3. B-regular languages are definable in MSO+inf$^-$, and S-regular
languages are definable in MSO+inf$^+$. The translations are effective.


B- and S-regular expressions. We consider the usual syntax of regular
expressions, except that apart from the usual Kleene star, which corresponds to
unrestricted iteration, there are two new iteration operations: finite iteration,
denoted $L^{<\infty}$, and infinite iteration, denoted $L^{\infty}$. Formally, we define profinite
sequences of profinite words as profinite words over the alphabet A with an
additional separator symbol †. A profinite word $x \in \widehat{A^+}$ is an element of a profinite
sequence $\hat{x}$ if †x† is an infix of †$\hat{x}$†. The concatenation of $\hat{x}$ is obtained by removing
the symbols †. We define $L^{\infty}$ (resp. $L^{<\infty}$ and $L^*$) as the set of concatenations of profinite
sequences containing infinitely (resp. finitely, arbitrarily) many separators, and
whose elements belong to L. B-regular expressions can only use the exponents
$<\infty$ and $*$, while S-regular expressions can only use the exponents $\infty$ and $*$.


Example 5. The B-regular expression $(a^{<\infty} b)^* a^{<\infty}$ describes precisely the
language accepted by the B-automaton $\mathcal{A}$ from Example 3: “every block of a’s has
a finite length”. The S-regular expression $(a+b)^* a^{\infty} (a+b)^*$ describes precisely
the complement of $L(\mathcal{A})$, i.e. the language accepted by the S-automaton $\mathcal{B}$.


Mimicking the standard translation from regular expressions to automata we get: 


Proposition 4. A language defined by a B-/S-regular expression is B-/S-regular. 


4 Recognizable languages 


Syntactic congruence. Just as multiplication is intimately related with regular
languages, multiplication together with the ω-power over $\widehat{A^+}$ turn out to be of
central importance for B- and S-regular languages. For notational reasons, we
view $(\widehat{A^+}, \cdot, \omega)$ as an algebra over the signature $(\cdot, \#)$, where the ω-power
of $\widehat{A^+}$ plays the role of the operation # of the signature. Let $L \subseteq \widehat{A^+}$. Its
$(\cdot, \#)$-syntactic congruence $\simeq_L$ is the coarsest equivalence relation over $\widehat{A^+}$
which preserves multiplication, the ω-power, and membership in L.


Example 6. Let $L = (a^{<\infty} b)^* a^{<\infty}$ be the language of the B-automaton which
computes the maximal length of a block of a’s. It is easy to see that the
equivalence classes of $\simeq_L$ (and also of $\simeq_K$, for $K = \widehat{A^+} - L$) are:

$$a^{<\infty}, \qquad (a^{<\infty} b)^+ a^{<\infty}, \qquad (a+b)^* a^{\infty} (a+b)^*.$$


Stabilization semigroups. We consider languages $L \subseteq \widehat{A^+}$ whose $(\cdot, \#)$-syntactic
congruence has a finite index. Such a set yields a finite $(\cdot, \#)$-syntactic
algebra, i.e. the quotient $S_L = \widehat{A^+} / {\simeq_L}$. Since $\simeq_L$ is a congruence, the syntactic
algebra is equipped with two operations: the usual multiplication, and
stabilization, denoted #, which stems from the ω-power in the profinite semigroup.
The syntactic algebra also naturally inherits the quotient topology from $\widehat{A^+}$,
which is usually non-Hausdorff, i.e. there might be singleton sets which are not
closed. (However, if L is a closed or open language, then the quotient topology
is $T_0$, i.e. if $x \in \overline{\{y\}}$ and $y \in \overline{\{x\}}$ for $x, y \in S_L$, then x = y.) Multiplication and
stabilization in $S_L$ are continuous with respect to the topology, and also satisfy
several properties which are easily derived from the properties of multiplication
and the ω-power over $\widehat{A^+}$. Namely, for s, t, e ∈ S:


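$$s \cdot (t \cdot s)^\# = (s \cdot t)^\# \cdot s \qquad\qquad s^\# \cdot s^\# = s^\#$$
$$(s^\#)^\# = s^\# \qquad\qquad e \cdot e^\# = e^\# \text{ for idempotent } e$$
$$(s^n)^\# = s^\# \text{ for } n = 1,2,3,\ldots \qquad\qquad s^\# \in \overline{\{s^n : n \in \mathbb{N}\}}.$$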


A stabilization semigroup is a $T_0$ topological space S equipped with two continuous
operations · and # satisfying the above axioms, in addition to associativity of ·.


Example 7. Let $S_L$ denote the quotient set induced by the language L from
Example 6. As noted there, $S_L$ consists of three equivalence classes, which we
denote by $[a]$, $[b]$ and $[a^\omega]$, respectively. Multiplication, stabilization and topology
over $S_L$ follow from the properties of the three equivalence classes: multiplication
is commutative and each element is idempotent, $[a^\omega]$ is the zero element and $[a]$
is the neutral element; stabilization maps $[a]$ to $[a^\omega]$ and s to s otherwise; $[a^\omega]$
is contained in the closure of $[a]$ and in the closure of $[b]$.
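The three-element algebra of this example is small enough to check mechanically. The sketch below encodes it under our reading of the example (zero element, neutral element, stabilization, and point closures; we additionally assume that the zero class is a closed point) and asserts the stabilization-semigroup axioms as we read them from the list above. The string names for the classes are ours.

```python
from itertools import product

A, B, Z = "[a]", "[b]", "[a^w]"   # Z stands for the class of words with an infinite block of a's
S = (A, B, Z)

def mul(s, t):
    if Z in (s, t):
        return Z                   # zero element
    return B if B in (s, t) else A  # [a] is neutral, [b] is idempotent

STAB = {A: Z, B: B, Z: Z}
# Closure of each point; the closure of Z being {Z} is our assumption.
CLOSURE = {A: {A, Z}, B: {B, Z}, Z: {Z}}

def powers(s):
    """The finite set {s, s^2, s^3, ...}."""
    seen, p = set(), s
    while p not in seen:
        seen.add(p)
        p = mul(p, s)
    return seen

def check_axioms():
    for s, t, u in product(S, repeat=3):
        assert mul(mul(s, t), u) == mul(s, mul(t, u))             # associativity
    for s, t in product(S, repeat=2):
        assert mul(s, STAB[mul(t, s)]) == mul(STAB[mul(s, t)], s)
    for s in S:
        assert mul(STAB[s], STAB[s]) == STAB[s]
        assert STAB[STAB[s]] == STAB[s]
        if mul(s, s) == s:
            assert mul(s, STAB[s]) == STAB[s]
        assert all(STAB[p] == STAB[s] for p in powers(s))
        assert STAB[s] in set().union(*(CLOSURE[p] for p in powers(s)))
    return True

print(check_axioms())
```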


Recognizability. We consider an analogue of the notion of recognizability by
semigroups in the classical theory. Recall that a subset $L \subseteq A^+$ is recognizable
if there is a mapping $\alpha : A \to S$ to a finite discrete semigroup such that for the
induced homomorphism $\hat\alpha : A^+ \to S$ we have $L = \hat\alpha^{-1}(F)$ for some $F \subseteq S$.

Instead of semigroups, we deal with finite stabilization semigroups. A
homomorphism $\hat\alpha$ from $\widehat{A^+}$ to a stabilization semigroup S is required to preserve
multiplication and map the ω-power in $\widehat{A^+}$ to stabilization in S. We use a notion
of invariance of $\hat\alpha$ under infinite substitutions, which intuitively means that if a
profinite word x is factorized into a profinite sequence of factors, and each factor
$x_i$ is replaced by some other factor $y_i$ with $\hat\alpha(x_i) = \hat\alpha(y_i)$, then, for the resulting
concatenation y of the factors $y_i$, $\hat\alpha(x) = \hat\alpha(y)$. We say that such a
homomorphism $\hat\alpha : \widehat{A^+} \to S$ is an ∞-homomorphism. The following result plays a pivotal
role in the theory, and its proof is difficult compared to the classical case.


Theorem 5. Let $\alpha : A \to S$ be any mapping from a finite alphabet A to a
finite stabilization semigroup S. Then there exists a unique ∞-homomorphism
$\hat\alpha : \widehat{A^+} \to S$ extending α. The mapping $\hat\alpha$ is continuous. Its image is the subset
of S generated from α(A) by the operations $(\cdot, \#)$.
Note that the extension $\hat\alpha$ is not necessarily the unique continuous homomorphic
extension of α. We call $\hat\alpha$ the ∞-homomorphism induced by α. We say that a
language $L \subseteq \widehat{A^+}$ is recognized by $\hat\alpha : \widehat{A^+} \to S$ if $L = \hat\alpha^{-1}(F)$ for some $F \subseteq S$; if
additionally F is closed (resp. open) in S, we say that L is ↓-recognizable (resp.
↑-recognizable). Note that a recognizable set is described in a finite manner by
$\alpha : A \to S$ and $F \subseteq S$. It is crucial that the image of $\hat\alpha$ can be computed from α.


Example 8. Let S be the stabilization semigroup $\widehat{A^+} / {\simeq_L}$ from the previous
example, whose elements are $[a]$, $[b]$, $[a^\omega]$. Let $\alpha : A \to S$ map a to $[a]$ and b to $[b]$.
We will check that the quotient mapping $\alpha_L : \widehat{A^+} \to S$ is the ∞-homomorphism
induced by α. We argue that $\alpha_L$ is invariant under infinite substitutions.
Consider a profinite word x, and choose some factorization of x. Replace each factor
by some other factor, with the same image under $\alpha_L$. Schematically:


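$$\begin{array}{ccccccccccc}
x & = & aaa & \cdot & aaba & \cdot & aaa & \cdots & ab^\omega a & \cdot & baaab \\
 & & \downarrow & & \downarrow & & \downarrow & \cdots & \downarrow & & \downarrow \\
y & = & aaaaa & \cdot & (ab)^\omega & \cdot & aaaaaaa & \cdots & aaaaabaaaa & \cdot & aaaaabaaa
\end{array}$$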


Intuitively, it is clear that if the original word x contains no infinite block of a’s,
then no such block can appear in the resulting word y either. Hence, $\alpha_L(y) = \alpha_L(x)$.


The proof of Theorem 5 extends the idea of Simon’s factorization trees to
profinite words and stabilization semigroups, which we briefly describe. Start with
any profinite word x. We want to determine the type of x, i.e. $\hat\alpha(x)$. If x is a
single letter a, then its type is α(a). If not, we try to factorize x into a profinite
sequence of factors, for which the type can be determined. We use three rules:

- If $x = x_1 \cdot x_2$, and $\hat\alpha(x_1) = s_1$, $\hat\alpha(x_2) = s_2$, then $\hat\alpha(x) = s_1 \cdot s_2$,

- If x factorizes into finitely many factors, each of idempotent type e, then $\hat\alpha(x) = e$,

- If x factorizes into infinitely many factors, each of idempotent type e, then $\hat\alpha(x) = e^\#$.

We prove by induction on |S| that in a finite number of steps, depending only
on |S|, using the above three rules, any profinite word x can be iteratively split
into single letters. Moreover, we prove that the resulting type does not depend
on the chosen “factorization tree”. The proof of existence of factorization trees
is similar to the proof of Simon’s theorem. The proof of uniqueness requires
the use of the axioms of stabilization
semigroups. It is similar to a proof of an analogous statement in [7]. An important
difference is that there, only finite words have factorization trees, and their
output is unique only in an asymptotic way.


The standard Cartesian-product construction yields several closure proper- 
ties for recognizable languages. For closure under projection, we use two en- 
hanced variants of the powerset construction, similar to constructions from [7]. 


Proposition 6. Recognizable languages are closed under Boolean combinations.
↓-recognizable (resp. ↑-recognizable) languages are closed under unions and
intersections. Complements of ↓-recognizable languages are ↑-recognizable and vice
versa. ↓-recognizable and ↑-recognizable languages are closed under projections.


By inductively applying the above to formulas of MSO+inf, we get:


Corollary 1. Languages definable in MSO+inf$^+$ are ↓-recognizable, and
languages definable in MSO+inf$^-$ are ↑-recognizable. The translations are effective.
5 The main results


The main theorem collects the notions and results listed above, proving the
equivalence of several characterizations. The last one is a finite-index
characterization of B-automata. To our knowledge, such a characterization has not
been, and perhaps cannot be, phrased in the remaining frameworks.


Theorem 7. Let $L \subseteq \widehat{A^+}$ and $K = \widehat{A^+} - L$ be its complement. The following
conditions 1-9 are equivalent:
1. L is defined by a B-regular expression,
2. $L = L(\mathcal{A})$ for some B-automaton $\mathcal{A}$,
3. L is definable in MSO+inf$^-$,
4. L is ↑-recognizable,
5. K is defined by an S-regular expression,
6. $K = L(\mathcal{B})$ for some S-automaton $\mathcal{B}$,
7. K is definable in MSO+inf$^+$,
8. K is ↓-recognizable,
9. The $(\cdot, \#)$-syntactic congruence of K has finite index and $K = \overline{K \cap A^{(\cdot,\omega)}}$.


In the last characterization, $A^{(\cdot,\omega)}$ is the set of profinite words which can be
generated from A by applying multiplication and the ω-power; they are analogues
of ultimately periodic words in the theory of ω-regular languages. It follows that
a B- or S-regular language is determined by its elements contained in $A^{(\cdot,\omega)}$,
similarly as an ω-regular language is determined by its ultimately periodic words.
By the last part of Theorem 5, the image of an ∞-homomorphism to a finite
stabilization semigroup can be computed using a fixed point calculation. Hence,
emptiness of recognizable languages is decidable. This proves the following.
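The fixed point calculation mentioned above can be sketched as a saturation loop: start from the images of the letters and close under multiplication and stabilization until nothing new appears. The code below is a minimal sketch under our own encoding; the function names and the string representation of the three-element semigroup of Examples 7 and 8 are assumptions made for illustration.

```python
def generated_subset(generators, mul, stab):
    """Least subset containing `generators` that is closed under mul and stab."""
    reached = set(generators)
    while True:
        new = {stab(s) for s in reached}
        new |= {mul(s, t) for s in reached for t in reached}
        if new <= reached:
            return reached
        reached |= new

def is_empty(alpha, mul, stab, accepting):
    """True if no element of the generated image lies in `accepting`."""
    return generated_subset(alpha.values(), mul, stab).isdisjoint(accepting)

# Hypothetical encoding of the three-element semigroup of Examples 7 and 8.
A, B, Z = "[a]", "[b]", "[a^w]"

def mul(s, t):
    if Z in (s, t):
        return Z                    # zero element
    return B if B in (s, t) else A   # [a] is neutral, [b] is idempotent

def stab(s):
    return Z if s == A else s

print(generated_subset({A, B}, mul, stab))   # all three classes are reached
print(is_empty({"b": B}, mul, stab, {Z}))    # over the letter b alone, the zero class is not reached
```

Since the semigroup is finite, the loop terminates after at most |S| rounds; emptiness of the recognized language then amounts to a disjointness test between the computed image and the accepting set.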


Theorem 8. Emptiness of Boolean combinations of B-regular languages is de- 
cidable. In particular, the limitedness problem is decidable for B-automata. 


The above result extends the decidability results of Hashiguchi and Kirsten. As 
emptiness of Boolean combinations reduces to inclusion testing, it is equivalent to 
the main result of [7] — that the domination relation is decidable for B-automata. 


6 From infinite words to profinite words 


We describe a connection between ω-words (i.e. mappings from $\mathbb{N}$ to A) and
profinite words. Recall that any ω-regular language can be presented as a finite
union of languages of the form $U \cdot V^\omega$, where $U, V \subseteq A^+$ are regular languages
of finite words. We generalize this observation, and provide a meta-reduction
from the satisfiability problems for logics over ω-words to corresponding logics
over profinite words. The proof resembles Büchi’s original proof of decidability
of MSO. Instead of the usual Ramsey lemma, we use the following observation
(originating from [5]): for any ω-word $w \in A^\omega$ there is a factorization
$w = u_0 \cdot u_1 \cdot u_2 \cdots$ such that the sequence $u_0, u_1, u_2, \ldots$ is convergent to some $u_\infty \in \widehat{A^+}$.
The proof is an easy, repeated application of the usual Ramsey lemma.

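Let $V \subseteq \widehat{A^+}$ be a language of profinite words, and $\varepsilon > 0$ a real number.
Consider the following language of infinite words $V^\omega_\varepsilon \subseteq A^\omega$:

$$V^\omega_\varepsilon = \{v_1 \cdot v_2 \cdot v_3 \cdots : \exists v_\infty \in V.\ \lim_{n \to \infty} v_n = v_\infty \text{ and } \forall n.\ d(v_n, v_\infty) < \varepsilon\}.$$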
For a regular language $U \subseteq A^+$ of finite words, we say that the expression
$U \cdot V^\omega$ is well-formed if the language $U \cdot V^\omega_\varepsilon$ does not depend on the choice
of $0 < \varepsilon < 1/n$, where n is the size of the minimal automaton recognizing
U. In this case, we define the language $U \cdot V^\omega$ as $U \cdot V^\omega_\varepsilon$ for any such ε. For example,
the expression $(a+b)^* \cdot (a^{<\infty} b)^\omega$ is well-formed and describes the language
$L_B$ from the introduction. For a class $\mathcal{L}$ of languages of profinite words, let
$\omega\mathcal{L}$ denote the class of all finite unions of languages defined by well-formed
expressions $U \cdot V^\omega$ with $U \subseteq A^+$ regular and $V \in \mathcal{L}$.

In the following theorem, by REGULAR, B-REGULAR, S-REGULAR, MSO+inf,
we denote the corresponding classes of languages of profinite words, and to each
we apply the map $\mathcal{L} \mapsto \omega\mathcal{L}$ as described above, yielding classes of languages of
infinite words. The proof of the theorem is very general. It generalizes Büchi’s
proof of decidability of MSO over infinite words.


Theorem 9. Every ω-regular language is in ωREGULAR. Every ωB-regular
language is in ωB-REGULAR. Every ωS-regular language is in ωS-REGULAR. Every
MSO+B definable language is in ωMSO+inf. The translations are effective.


The reduction described above allows us to transfer results from profinite words to
ω-words. For instance, the main results of [4] (concerning ωB- and ωS-regular
languages) follow from the results in our paper. More importantly, we get:


Corollary 2. The satisfiability problem for the logic MSO+B over ω-words
reduces to the satisfiability problem for the logic MSO+inf over profinite words.


We mention that by refining our Theorem 9, Skrzypczak [14] proved that a
language of infinite words which is both ωB-regular and ωS-regular must in fact
be ω-regular, reflecting the immediate, analogous fact for profinite words.


Conclusion. We presented a new proof and framework for the limitedness problem.
We raise the question of decidability of the logic MSO+inf over profinite words.